AMENDMENTS TO THE CLAIMS

1. (previously presented) A method implemented in a computer system, for
clustering a string, the string including a plurality of characters, the method
including:

identifying R unique n-grams T_1...R in the string;
for every unique n-gram T_S:
    if the frequency of T_S in a set of n-gram statistics is not greater than a
    first threshold:
        clustering the string with a cluster associated with T_S;
    otherwise:
        for every other n-gram T_V in the string T_1...R, except S:
            concluding that the frequency of n-gram T_V is greater than the
            first threshold, and in response:
                if the frequency of n-gram pair T_S-T_V is not greater than a
                second threshold:
                    clustering the string with a cluster associated with the
                    n-gram pair T_S-T_V;
                otherwise:
                    for every other n-gram T_X in the string T_1...R, except S and V:
                        clustering the string with a cluster associated with
                        the n-gram triple T_S-T_V-T_X;
where T_1...R is a set of n-grams, R is the number of elements in
T_1...R, and T_S, T_V, and T_X are members of T_1...R, and S, V,
and X are integer indexes to identify members of T_1...R.
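
For illustration only, the nested conditions recited in claim 1 can be sketched in Python. This is a minimal sketch and not part of the claim language. The function names, the default n-gram length of 3, the use of dictionaries for the statistics, and the keying of pair frequencies by an unordered pair are all assumptions made for the example.

```python
def unique_ngrams(s, n=3):
    """Return the unique n-grams of s in order of first appearance."""
    return list(dict.fromkeys(s[i:i + n] for i in range(len(s) - n + 1)))


def cluster_keys(s, gram_freq, pair_freq, first_threshold, second_threshold, n=3):
    """Return the cluster keys (1-, 2- or 3-tuples of n-grams) for string s.

    gram_freq maps an n-gram to its frequency (hypothetical layout).
    pair_freq maps frozenset({a, b}) to the pair frequency (hypothetical layout).
    """
    grams = unique_ngrams(s, n)
    keys = set()
    for ts in grams:
        # Rare n-gram: the n-gram alone identifies the cluster.
        if gram_freq.get(ts, 0) <= first_threshold:
            keys.add((ts,))
            continue
        for tv in grams:
            if tv == ts:
                continue
            # Only pairs of two frequent n-grams are considered.
            if gram_freq.get(tv, 0) <= first_threshold:
                continue
            if pair_freq.get(frozenset((ts, tv)), 0) <= second_threshold:
                # Rare pair: the pair identifies the cluster.
                keys.add((ts, tv))
            else:
                # Frequent pair: fall back to every triple containing it.
                for tx in grams:
                    if tx not in (ts, tv):
                        keys.add((ts, tv, tx))
    return keys
```

In this sketch a string may be associated with several clusters, one per key returned, which mirrors the "for every" wording of the claim.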

2. (original) The method of claim 1 further including compiling n-gram statistics.

3. (original) The method of claim 1 further including compiling n-gram pair
statistics. 
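
For illustration only, compiling the n-gram statistics and n-gram pair statistics of claims 2 and 3 might look like the following Python sketch. The claims do not define how frequency is measured; the sketch assumes it is the number of strings in a corpus that contain the n-gram (or both n-grams of a pair). The function names and the corpus argument are hypothetical.

```python
from collections import Counter
from itertools import combinations


def unique_ngrams(s, n=3):
    """Return the unique n-grams of s in order of first appearance."""
    return list(dict.fromkeys(s[i:i + n] for i in range(len(s) - n + 1)))


def compile_statistics(strings, n=3):
    """Count, over all strings, how many contain each n-gram and each pair."""
    gram_freq = Counter()
    pair_freq = Counter()
    for s in strings:
        grams = unique_ngrams(s, n)
        # Each string contributes at most 1 to any n-gram or pair count.
        gram_freq.update(grams)
        pair_freq.update(frozenset(p) for p in combinations(grams, 2))
    return gram_freq, pair_freq
```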

4. (canceled) 

5. (canceled) 

6. (previously presented) A method implemented in a computer system, for
clustering a string, the string including a plurality of characters, the method
including:

identifying R unique n-grams T_1...R in the string;
for every unique n-gram T_S:
    if the frequency of T_S in a set of n-gram statistics is not greater than a
    first threshold:
        clustering the string with a cluster associated with T_S;
    otherwise:
        for i = 1 to Y:
            for every unique set of i n-grams T_U in the string T_1...R, except S:
                if the frequency of the n-gram set T_S-T_U is not greater than a
                second threshold:
                    clustering the string with a cluster associated with the
                    n-gram set T_S-T_U;
        if the string has not been associated with a cluster with this value of T_S:
            for every unique set of Y+1 n-grams T_UY in the string T_1...R, except S:
                clustering the string with a cluster associated with the
                Y+2 n-gram group T_S-T_UY,
where T_1...R is a set of n-grams, R is the number of elements in
T_1...R, and T_S, T_V, and T_X are members of T_1...R, and S, V,
and X are integer indexes to identify members of T_1...R.
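
For illustration only, the generalized loop of claim 6 can be sketched in Python. This is a sketch under stated assumptions, not the claimed method itself. It assumes that n-gram sets are unordered, that their frequencies are held in a dictionary keyed by `frozenset`, and that a set absent from the dictionary has frequency 0. All names and the default n-gram length are hypothetical.

```python
from itertools import combinations


def unique_ngrams(s, n=3):
    """Return the unique n-grams of s in order of first appearance."""
    return list(dict.fromkeys(s[i:i + n] for i in range(len(s) - n + 1)))


def cluster_keys_general(s, gram_freq, set_freq, first_threshold,
                         second_threshold, y=1, n=3):
    """Return cluster keys (frozensets of n-grams) for string s."""
    grams = unique_ngrams(s, n)
    keys = set()
    for ts in grams:
        # Rare n-gram: the n-gram alone identifies the cluster.
        if gram_freq.get(ts, 0) <= first_threshold:
            keys.add(frozenset((ts,)))
            continue
        others = [g for g in grams if g != ts]
        matched = False
        # Try sets of 1..Y additional n-grams alongside this n-gram.
        for i in range(1, y + 1):
            for tu in combinations(others, i):
                group = frozenset((ts, *tu))
                if set_freq.get(group, 0) <= second_threshold:
                    keys.add(group)
                    matched = True
        # No rare set found for this n-gram: use every group of Y+2 n-grams.
        if not matched:
            for tuy in combinations(others, y + 1):
                keys.add(frozenset((ts, *tuy)))
    return keys
```

With `y=1` the loop examines pairs and falls back to triples, which corresponds to the case recited in claim 7.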

7. (original) The method of claim 6 where Y = 1. 

8. (original) The method of claim 6 further including compiling n-gram statistics. 

9. (original) The method of claim 6 further including compiling n-gram group 
statistics. 

10. (currently amended) [[A computer program, stored on a tangible storage
medium, for use in]] An article comprising a computer-readable storage medium
having a computer program stored thereon for clustering a string, the program
including executable instructions that cause a computer to:

identify R unique n-grams T_1...R in the string;
for every unique n-gram T_S:
    if the frequency of T_S in a set of n-gram statistics is not greater than a
    first threshold:
        clustering the string with a cluster associated with T_S;
    otherwise:
        for every other n-gram T_V in the string T_1...R, except S:
            concluding that the frequency of n-gram T_V is greater than the first
            threshold and in response:
                if the frequency of n-gram pair T_S-T_V is not greater than a
                second threshold:
                    clustering the string with a cluster associated with the
                    n-gram pair T_S-T_V;
                otherwise:
                    for every other n-gram T_X in the string T_1...R, except S and V:
                        cluster the string with a cluster associated with
                        the n-gram triple T_S-T_V-T_X;
where T_1...R is a set of n-grams, R is the number of elements in
T_1...R, and T_S, T_V, and T_X are members of T_1...R, and S, V,
and X are integer indexes to identify members of T_1...R.



11. (currently amended) The [[computer program]] article of claim 10 further
including executable instructions that cause a computer to compile n-gram
statistics.



12. (currently amended) The [[computer program]] article of claim 10 further
including executable instructions that cause a computer to compile n-gram pair
statistics.